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Calibration  of  quantum  computers  is  essential  to  the  effective  utilisation  of  their  quantum  resources. 
Specifically,  the  performance  of  quantum  annealers  is  likely  to  be  significantly  impaired  by  noise  in  their 
programmable  parameters,  effectively  misspecification  of  the  computational  problem  to  be  solved, 
often  resulting  in  spurious  suboptimal  solutions.  We  developed  a  strategy  to  determine  and  correct 
persistent,  systematic  biases  between  the  actual  values  of  the  programmable  parameters  and  their 
user-specified  values.  We  applied  the  recalibration  strategy  to  two  D-WaveTwo  quantum  annealers,  one 
at  NASA  Ames  Research  Center  in  Moffett  Field,  California,  and  another  at  D-Wave  Systems  in  Burnaby, 
Canada.  We  show  that  the  recalibration  procedure  not  only  reduces  the  magnitudes  of  the  biases  in 
the  programmable  parameters  but  also  enhances  the  performance  of  the  device  on  a  set  of  random 
benchmark  instances. 


Quantum  annealing  (QA)  is  a  metaheuristic  for  solving  combinatorial  optimization  problems1.  The  recent  intro¬ 
duction  of  QA  hardware  by  D-Wave  Systems2,3  has  invigorated  theoretical  and  experimental  research  into  the 
computational  power  and  practical  implementation  challenges  of  the  QA  paradigm.  Current  research  studies  focus 
on  both  fundamental  and  applied  aspects,  including  application  to  real-world  problems4-8,  criteria  for  detecting 
quantum  speedup9,  the  computational  role  of  quantum  tunneling10,  error-supression11,  the  relationship  between 
classical  simulated  annealing  and  quantum  annealing12-17,  spin-glass  perspectives  on  the  hardness  of  computational 
problems18-22,  and  programming  strategies  that  address  intrinsic  noise23,24. 

The  quantum  annealers  used  for  this  study  are  of  the  second  generation  of  D-Wave  devices,  also  called  D-Wave 
Two2:  one  located  at  NASA  Ames  Research  Center  in  Moffett  Field,  California,  (“NASA  device”),  and  another 
located  at  D-Wave  Systems  in  Burnaby,  Canada  (“Burnaby  device”).  These  consist  of  64  unit  cells  of  a  previously 
characterized  eight-qubit  unit  cell3,25.  In  the  NASA  and  Burnaby  devices,  post-fabrication  characterization  deter¬ 
mined  that  only  509  and  424  qubits,  respectively,  out  of  the  512  qubit  arrays  can  be  reliably  used  for  computation. 
The  array  of  coupled  superconducting  flux  qubits  is,  effectively,  an  artificial  Ising  spin  system  with  programmable 
spin-spin  couplings  and  longitudinal  and  transverse  magnetic  fields.  It  is  designed  to  solve  instances  of  the  following 
(NP-hard26)  classical  optimization  problem:  Given  a  set  of  local  fields  {/i2]C  [—2,  2]  and  couplings  {/y}C  [  —  1, 1], 
find  the  assignment  s*  =  sfs2*  •  •  -  s  *  C  {±  1}M,  that  minimizes  the  objective  function, 

Rising  («)  =  I]  hiSi  +  E  JijsiSj ■ 

1  <i<n  1  <i<j<n  (1) 

Finding  the  optimal  s *  is  equivalent  to  finding  the  ground  state  of  the  corresponding  quantum  Ising  Hamiltonian 
Hising  =  Hi  hrf  +  Hicj  JijCrfcTj,  where  a f  is  the  Pauli  z  operator  acting  on  the  fth  spin.  More  details  of  QA  can 
be  found  in  the  supplementary  information. 

Currently,  D-Wave  devices  are  only  calibrated  at  the  level  of  ensuring  that  the  low-level  control  circuitry  has 
its  intended  effect  on  the  physical  quantities  like  current,  flux,  etc.,  that  it  is  meant  to  control27  (Lanting,  T.  Private 
communication,  2015).  Early  research  into  the  performance  of  D-Wave  devices  has  indicated  the  presence  of 
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significant  imprecision  in  the  setting  of  the  fields  that  define  the  problem  to  be  solved,  a  significant  impairment 
to  the  successful  solution  of  the  problem13’17,19’23.  Recently,  some  work  has  used  a  phenomenological  noise  model 
of  the  fields  {/z2}  and  {/^j  in  which  the  distributions  of  the  deviations  from  the  programmed  values  are  given  by 
Gaussians  with  means  zero  and  standard  deviations,  respectively,  of  0.05  and  0.035  (in  units  of  the  maximal  /y), 
independently  instantiated  for  each  qubit  and  anneal  and  constant  throughout  the  course  of  a  given  anneal.  The 
parameters  of  the  Gaussians  were  derived  by  adding  in  quadrature  the  variances  of  several  known  microscopic 
sources  of  noise  (Lanting,  T.  Private  communication,  2015).  This  model  has  been  used  in  an  attempt  to  explain 
the  failure  rate  of  D- Wave  devices  as  partly  due  to  misspecification  of  the  programmable  values.  There  are  many 
sources  of  noise  in  quantum  annealers,  each  with  a  different  effect  and  time  scale,  and  we  address  here  only  one 
manifestation.  The  variances  we  report  in  this  paper  are  in  a  sense  incomparable  to  those  just  mentioned,  and 
relevant  only  within  the  context  of  the  experiments  described  below. 

The  presence  of  systematic  biases  in  quantum  annealers  has  been  reported  elsewhere24.  The  biases  referred 
to  there  are  fundamentally  different  in  nature  from  the  ones  address  here,  in  that  the  former  are  collective  biases 
on  the  qubits  of  ferromagnetic  chains  that  depend  on  the  strength  and  topology  of  the  couplings  therein  and  are 
due  to  the  noise  specifically  caused  by  those  couplings,  and  they  must  be  determined  anew  for  each  embedding 
topology  used.  In  this  work,  we  present  a  methodology  for  determining,  in  parallel  and  using  a  relatively  small 
amount  of  total  annealing  time,  the  persistent  (see  Fig.  S3),  systematic  biases  in  all  of  the  individually  available 
programmable  parameters  of  a  quantum  annealer.  We  show  that  correcting  for  these  biases  produces  an  increase 
in  the  quality  of  solutions  found  on  a  set  of  random  benchmark  instances.  The  strategy  presented  here  is  the  first 
proposal  for  a  software -level  recalibration  of  the  full  device  by  the  user,  i.e.  based  only  on  the  data  from  tailored 
instances  and  without  access  to  the  low-level  control  circuitry 


Results 

Because  actual  quantum  annealers  operate  at  non-zero  temperature,  there  exists  some  threshold  for  the  values  of 
the  fields  {ht}  and  couplings  {/y}  below  which  thermal  effects  dominate  the  annealing  process.  When  the  strengths  of 
the  fields  and  couplings  are  set  sufficiently  small,  the  probabilities  of  the  final  states  of  the  qubits  are  well  described 
by  a  Boltzmann  distribution.  Roughly,  the  relevant  energy  scale  is  given  by  kT ,  where  k  is  Boltzmanns  constant 
and  T  is  an  effective  temperature,  not  necessarily  equal  to  the  device  temperature.  (Henceforth,  we  will  work  in 
units  in  which  k=  1.)  By  running  experiments  in  this  regime,  persistent  biases  in  the  values  of  the  programmed 
fields  can  be  uncovered,  as  described  in  detail  in  the  Methods  section. 

In  our  model,  h{  =  +  h /bHs  the  effective  value  of  the  local  field  of  qubit  i,  where  is  the  user-programmed 

value  and  h ^  is  the  h /^-independent  bias.  In  an  ideal  device,  =  0. 

Let  p.  (|  )  [p.  (|  |  )  ]  be  the  probability  of  qubit  i  being  in  the  spin-up  [spin-down]  state  at  the  end  of  an 

anneal  with  the  programmed  value  h\v\  A  completely  thermal  model  for  this  probability  is  given  by 


Pi0\hiv)) 


Tf) 
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where  cq  (fi/p\  h j;h\  Tt)  =  hj  Tt  =  +  h and  Tt  is  the  effective  temperature  of  qubit  i.  This  yields 


=  <*,-(&,•  p)>  h{b\  T,.)  =  -  In 


1  -P,.(T|  hjp) 

P,(  T|h/p)) 


(3) 


(We  assume  that  and  T2  are  constant  at  least  over  the  course  of  the  experiment.)  More  generally,  we  define  the 
function  a(p)  =  (1/2)  In  [(1  —  p)/p\.  Once  experimental  values  ofcq(/z/p))  are  obtained  for  various  values  of  h^p\ 
the  data  are  fit  to  obtain  the  estimates  of  hj^  and  Tt.  The  biases  of  the  couplings  between  qubits  can  be  determined 
in  a  similar  fashion.  (See  Methods  section  for  more  details). 

To  illustrate  the  efficacy  of  our  approach  in  quantum  annealers,  we  applied  our  method  to  two  D-Wave  devices, 
the  NASA  device  and  the  Burnaby  device. 


Determination  of  the  h  biases.  Figure  l(a,b)  shows  two  windows  ofa:^)  (h^)  for  one  experiment  on  the 
NASA  device.  Figure  1(b)  shows  the  range  fi(p)  E  [—0.1, 0.1]  used  in  the  calculation  of  the  biases.  There,  the  linearity 
of  c£h\  and  thus  the  accuracy  of  the  thermal  model,  is  evident;  this  is  typical  of  all  the  experiments  reported. 
Figure  1(a)  shows  a  wider  window  fi(p)  E  [—0.35, 0.35],  where  the  nonlinearity  outside  of  the  former  range  is  evident, 
indicating  the  failure  of  the  thermal  model  for  larger  magnitudes  of  fi(p),  where  annealing  dynamics  start  to  dom¬ 
inate  thermal  dynamics.  Figure  1(c)  shows  a2  ( h ^ ),  revealing  the  limited  resolution  0.025  of  the  digital- to -analog 
converters  (DACs)  used  to  implement  h2.  Such  stepping  behavior  is  typical  for  all  of  the  qubits. 


Narrowing  of  the  bias  distribution.  To  show  the  correctability  of  the  persistent  biases,  we  ran  the  experiment 
described  above  repeatedly,  each  time  attempting  to  correct  the  biases  using  estimates  thereof  from  the  prior 

iteration.  Let  h  [T^’^  ]  be  the  experimentally  determined  value  of  the  bias  h [median  temperature  T ^  ]based 
on  the  k- iteration;  by  convention,  set  =  0.  Let  fi(p,0)  be  the  desired  programmed  value.  In  the  k- th  iteration, 
we  set 
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Figure  1.  Detection  and  correction  of  systematic  biases,  (a)  From  a  single  experiment  without  correction,  the 
median  quantity  3^)  (//p)  )}  over  the  qubits,  of  the  quantities  jcq(/z^ )}  for  various  values  of  fi(p)  in  [—0.35,  0.35]. 
(b)  Same  as  (a),  but  only  using  values  of  fi(p)  in  [—0.1,  0.1].  Note  the  tightness  of  the  fit  to  a  line,  indicating  the 
validity  of  the  thermal  model,  (c)  From  the  same  experiment  as  in  (b),  the  quantity  a2  ( h ^ )  for  a  single,  typical 
qubit,  2.  Note  the  step  function  resulting  from  the  limited  precision  of  the  digital-to-analog  converter  used  to 

control  the  field.  (d,e)  The  estimated  biases  {hi  }  from  two  experiments  k  =  1,  2.  The  first  is  without 
correction;  the  second  was  corrected  using  the  biases  estimated  from  the  first.  Note  that  the  distribution 
significantly  more  narrowed  and  centered  near  zero  after  correction,  (f)  The  average  over  the  qubits  of  the 
quantites  jfiq  for  different  values  of  the  programmed  h^\  with  error  bands  given  by  the  standard  deviation.  As 
for  (d,e),  note  the  narrowing  and  centering  of  the  distributions. 


h\ p)  =  hfM  =  h(p’0)  - 

'  '  k  (4) 

Figure  l(d,e)  show  the  distribution  of  T<hk> }  for  the  first  and  second  iterations,  i.e.  before  and 

after  the  recalibration  procedure.  The  narrowing  of  the  distribution  is  a  clear  indication  that  the  procedure  is 
working  to  remove  the  biases. 

Further  support  for  the  success  of  recalibration  is  provided  by  looking  at  the  distribution  over  the  qubits  of  the 
success  probabilities  for  all  values  of  /z(p).  Figure  1(f)  shows  a  uniform  reduction  of  the  variance  over  the  qubits  in 
the  values  of  h.  Each  point  is  the  mean  h  over  the  qubits  of  {fij  for  the  corresponding  h^\  and  the  shaded  region 
indicates  the  standard  deviaton.  The  narrowing  of  the  distribution  is  clear  evidence  that  the  recalibration  procedure 
not  only  narrows  the  distribution  of  the  biases  [Fig.  l(d,e)],  as  reflected  in  the  shift  of  the  mean,  but  also  reduces 
the  variance  for  all  values  of  fi(p). 

Determination  of  the  J  biases.  In  the  data  presented  here,  the /biases  were  determined  using  (7).  The 

programmable  value  of  the  local  fields  { h ^ }  were  uniformly  set  to  zero  (see  Methods  section  for  more  details). 

Figure  2(a)  shows  the  median  quantity  dP^  for  evenly  spaced  values  of  /(p)  in  [—0.1,  0.1],  as  well  as  a  line  fit 
thereto.  The  closeness  of  the  fit  of  the  line  confirms  the  accuracy  of  the  thermal  model  as  for  cfh\ 

Figure  2(b),  analogous  to  Fig.  S2(b),  shows  the  standard  deviation  aj  over  the  qubits  of  the  estimates 

{/[  ■  }  =  {a[k)  T ^ }  of  {Jn}  for  different  values  of  /p),  for  each  of  the  three  iterations  of  experiments  as  described 
above.  The  estimates  {/••  }  were  calculated  independently  for  each  iteration  k  using  its  mean  effective  temperature 

_ tj  J 

T  .  There  is  a  significant  change  from  the  first  iteration  of  corrections,  but  then  the  standard  deviation  remains 
approximately  the  same  after  the  second  iteration.  Considering  the  average  over  the  values  of  /p),  the  overall 
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Figure  2.  /biases  in  the  Burnaby  device.  Data  from  a  series  of  three  experiments  ( k  =  1,  2,  3)  in  which  for 
each  experiment  the  sums  of  biases  estimated  in  the  previous  ones  are  subtracted  from  the  original 
programmed  values,  using  Eq.  5.  The  first  experiment  is  without  any  correction,  and  the  second  and  third 
use  increasingly  accurate  corrections.  All  quantities  are  calculated  using  the  mean  of  the  qubit 

temperatures,  calculated  indepently  in  each  experiment,  (a)  From  only  the  first  experiment,  the  median 
quantity  ),  over  the  couplers,  of  the  quantities  )}  for  evenly  spaced  values  of  /(p)  in  [—0.1, 

0.1].  (b)  The  standard  deviation  over  the  couplers  of  the  estimated  J ^  at  each  value  of  the  original  /(p,0). 

ij  -(r) 

(c)  For  a  single,  typical  coupler,  (41,  47),  the  st^ji^ard  deviation  over  100  runs  of  the  estimates  {/41  47}  of  /41 47 
versus  the  original  /(p,0).  (d-f)  Residual  biases  ]  .. 7  ;  estimated  from  each  of  the  experiments. 


variance  is  about  the  same  before  and  after  correction,  yet  is  much  more  uniform  after  correction,  which  we  con¬ 
sider  beneficial. 

Figure  2(c),  analogous  to  Fig.  Sl(c),  shows  the  standard  deviation  afu%  of  the  estimates  {/41^47}.  The  correction 
seems  to  have  no  effect  in  this  regard,  and  the  there  is  a  consistent  increase  in  the  variance  with  increasing  mag¬ 
nitude  of  the  programmed  value  /(p).  A  similar  phenomenon  occurred  in  the  analogous  h  data. 

Define  to  be  the  estimate  T ^  for  using  the  data  from  the  A:-th  iteration,  with  /(.b,°)  =  0  by  con¬ 

vention.  In  the  /cth  iteration,  we  set  the  programmed  values  of  the  couplers  to 

r(p)  =  r(p,fc)  t( p,0)  _  yy  Rb,fc') 

Jij  Jij  J  Z_^Hj  >  /r\ 

k'=  1  UV 

i.e.  by  subtracting  the  sums  of  the  residual  biases  from  the  prior  iterations  from  the  desired  values.  Figure  2(d-f) 
show  the  narrowing  of  the  distribution  of  residual  /  biases  with  correction. 

Unlike  the  case  for  the  h  biases,  for  which  data  indicate  that  the  distribution  is  essentially  converged  after  a  single 
iteration,  here  we  see  two  new  phenomena.  First,  the  distribution  continues  to  narrow  between  the  second  and  third 
iterations.  Second,  the  distribution  of  the  residual  biases  from  the  second  iteration,  while  narrower  than  that  from 
the  first,  is  not  centered  around  zero.  We  believe  this  is  due  to  over  correction;  that  is,  the  estimates  of  the  biases 
from  the  first  iteration  have  a  high  degree  of  uncertainty,  and  so  simply  subtracting  their  values  from  the  intended 
value  introduces  some  amount  of  bias  itself.  This  is  consistent  with  the  overall  small  magnitudes  of  the  /  biases 
relative  to  those  of  the  h  biases,  especially  as  compared  to  the  corresponding  noise  levels.  This  overcorrection  can 
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Range  r7 

1 

2 

4 

8 

16 

Greedy 

0.58 

0.63 

0.59 

0.68 

0.53 

Elite  mean 

0.65 

0.65 

0.73 

0.72 

0.67 

Table  1 .  Comparison  of  performance  with  and  with  h-c orrection  on  benchmarks.  The  probability  that 
correcting  for  h  biases  (using  data  from  a  single  experiment)  improved  performance  on  100  random  instances 
from  an  ensemble  parameterized  by  the  range  of  values  r.  Performance  was  compared  according  to  two  metrics: 
greedy  comparison  of  the  energies  and  degeneracies,  and  comparison  of  the  elite  mean  score  function. 


be  mitigated  by  weighting  the  correction  in  a  way  that  accounts  for  the  uncertainty  in  the  estimate  using  Bayesian 
reasoning.  We  will  explore  these  ideas  further  in  future  work. 


Effect  of  correction  on  the  performance  of  benchmark  problems.  Ultimately,  the  goal  of  calibration 
is  to  optimize  the  performance  of  a  quantum  annealer  on  problems  of  computational  interest.  It  is  not  clear  a  priori 
that  the  biases  present  in  one-  and  two-qubit  experiments  are  the  same  as  those  present  in  anneals  involving  hun¬ 
dreds  of  qubits.  Even  if  they  were,  their  estimation  would  be  of  no  practical  value  unless  their  correction  improves 
performance.  To  address  this,  we  tested  the  effect  of  correcting  the  h  biases  on  the  performance  of  the  quantum 
annealer  at  NASA  Ames,  using  the  same  parameterized  random  ensemble  of  instances  used  in  a  previous  study 
benchmarking  a  D-Wave  quantum  annealer9.  As  in  those  studies,  r  is  a  parameter  that  tunes  the  difficulty  of  the 
average  instance  (the  larger  the  r  the  more  difficult  the  average  instance).  We  expected  the  recalibration  to  have  a 
major  positive  impact  in  the  harder  family  of  instances. 

For  each  instance,  the  uncorrected  and  corrected  results  were  compared  using  two  methods,  a  “greedy”  one  and 
the  elite  mean.  The  results  of  the  comparison  are  summarized  in  Table  1,  showing  the  proportion  of  instances,  for 
each  r,  for  which  the  correction  improved  the  performance,  using  each  of  the  two  comparison  methods  described 
above.  The  data  set  indicates  that  correcting  for  the  h  biases  improves  performance  according  to  these  two  rea¬ 
sonable  metrics.  At  a  large  enough  range  r,  however,  even  correction  of  the  biases  is  not  enough.  In  this  limit  of 
large  r,  the  spacing  of  1/r  between  the  different  specifiedd  /  values  is  beyond  the  precision  of  the  device  and  poorly 
resolved.  In  this  limit  inherent  fluctuations  lead  to  almost  zero  success  probabilities  due  to  problem  misspecifica- 
tion,  i.e.,  the  device  is  finding  the  solution  to  another  problem  different  from  the  one  indicated.  This  would  explain 
the  possible  pattern  seen  in  the  elite  mean  comparison  (Table  1)  that  the  advantage  of  correction  peaks  seems  to 
peak  at  the  level  of  r  considered  to  correspond  to  the  precision  limit  of  the  device.  (That  such  a  pattern  is  not  as 
apparent  in  the  greedy  comparison  is  easily  explained  by  natural  noisiness  of  that  comparison  method,  especially 
for  instances  with  extremely  low  success  probability  as  was  the  case  here). 

Discussion 

Disentangling  the  mutual  effect  of  the  h  and  /  biases  on  each  other  by  alternating  between  iterations  of  the  iterations 
of  h  and  /  experiments  (as  opposed  to  doing  each  alone  as  reported)  will  likely  lead  to  more  accurate  estimates  of 
each  individually  Lastly,  the  risk  of  overcorrection  can  be  mitigated  by  weighting  the  correction  by  the  degree  of 
certainty  of  the  estimate  of  the  bias  to  be  corrected. 

Although  we  focused  initially  on  a  standard  random  ensemble  of  Ising  instances  for  benchmarking  the  perfor¬ 
mance  of  quantum  annealers,  the  effect  of  correcting  biases  should  be  greatest  on  instances  whose  ground  states 
are  most  sensitive  to  misspecification  of  the  programmable  parameters20,21. 

There  is  reason  to  suspect  that  correction  will  also  have  a  beneficial  effect  in  reducing  the  effect  of  gauge  selec¬ 
tion  on  success  probability  While  there  are  other  suspected  reasons  for  the  effect  of  gauge  selection  (which  would 
be  non-existent  in  an  ideal  device),  biases  such  as  the  ones  corrected  here  could  be  one  of  the  leading  factors.  The 
effect  of  gauge  selection  is  significant,  sometimes  leading  to  an  orders-of-magnitude  difference  in  the  success 
probabilities,  and  so  this  is  a  promising  application  for  bias  correction. 

Importantly,  while  the  /  biases  determined  here  are  in  general  smaller  than  the  h  biases,  numerical  studies 
indicate  that  often  instances  are  more  sensitive  to  misspecification  in  the  /  parameters  than  in  the  h  parameters20. 

The  methods  presented  here  complement  a  growing  suite  of  tools  for  optimal  programming  of  quantum 
annealers23,24,  tuning  the  performance  thereof  to  cope  with  the  intrinsic  noise  in  current  and  future  physical 
implementations. 


Methods 

Calculation  of  the  h  biases.  To  experimentally  determine  the  biases  { h ^ },  we  set  all  / =  0  and  initially 
assume  the  effect  of  nonzero  {J^ }  to  be  negligible.  We  therefore  ran  the  experiments  for  all  qubits  within  a  given 
device  simultaneously,  with  the  same  value  of  =  h ^  for  every  working  qubit.  Each  value  of  /z(p)  was  run  ioo 
times,  where  each  run  consisted  of  1,000  annealing  cycles.  The  probability  p^  (  f  )  was  calculated  for  each  run  r, 
and  the  median  probability  p  ( j \h& )  taken  over  the  100  {p^  ( j \h^ ) }  calculated.  From  this,  we  define  the 
“median”  at  (h& )  =  a(p.(t  |/z/p^))  =  0.5  In  [1  -  p.(T|^f  )]  -  0.5  In  p.(||/z/p^).  Technically,  this  is  a  slight 

abuse  of  terminology;  while  the  median  of  {a  (p^  ) }  is  almost  the  same  as  a  (p.)  because  the  function  a(p)  is 
monotonic,  the  two  quantities  can  differ  slightly  in  the  case  of  an  even  number  of  values. 

Since  cq  =  hj-^f  Tt  +  h\b\ /  for  each  qubit,  we  fit  a  line  to  { (h\v\  cq) }  by  minimizing  the  quadratic  loss;  the 
resulting  slope  gives  us  an  estimated  inverse  qubit  temperature  j3i=  1/  Tf,  and  from  the  intercept  a[b^  we  can 
determine  the  bias  in  several  ways. 
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One  way  is  to  simply  use  the  fitted  parameters  as  is:  ht  =  TP  Experimental  data,  however,  indicate  that  the 
estimates  of  the  qubit  “temperatures”  calculated  as  above  are  not  exactly  that,  but  include  in  their  calculation  effects 
other  than  that  due  to  true  variation  in  temperature  between  the  qubits.  Some  estimate  of  a  uniform  device  temper¬ 
ature  should  therefore  be  used.  (See  Sec.  Ill  in  the  SI  for  more  detail.)  In  our  experiments,  we  used  two  different 

quantities.  The  first  is  the  “mean  temperature”,  =  -V”=1  Tp  where  Tt  =  1/  and  n  is  the  number  of  (working) 

„~(h)  n  ~ 

qubits.  The  second  is  the  “median  temperature”  T  ,  called  thus  not  because  it  is  the  median  of  {Tf}  but  because  it 
is  calculated  by  taking  the  inverse  of  the  slope  of  the  line  fit  to  the  points  { (hP\  ) },  where  ( h ^ )  is  defined 

as  the  median  over  the  qubits  of  {at  ( h ^ ) }.  In  practice,  the  quantities  ^  and  T ^  are  effectively  the  same. 

Calculation  of  the  J  biases.  Let  W  +  fip  be  the  effective  value  of  the  coupling  between  qubits  i  and 
j  and  p..  (|||/z/p\  hj^\  hjp\  h^\  J jp )  be  the  probability  of  qubits  i  and  j  both  being  in  the  spin-up  state, 

where  fip  is  the  programmed  value  of  the  coupler  and  J ip  the  bias,  analogous  to  h  P  and  hf^\  respectively  Here, 
we  set  hP  =  =0,  and  write  simply  p..  (TT 1/^ )  =  p.. (T  T I  hf\  jjf\  iP )  Other  probabilities  p..  (•  .  |  /^) 

and  combinations  thereof  such  as  p  (TT  Vjj|/|p^ )  =  p{.  (TT  \Jp )  +  (li|  Jp  )> are  analogously  denoted. 

One  approach  to  determine  the  bias  jp  is  to  naively  assume  that  h P  =  h P  =  0,  in  which  case  the  thermal 
distribution  is  modeled  by 


^(TTvUI/f) 


2e 


-a~(jk 
1  y 


(p)i 


2e 


a-.(j & 
vyv 


(p)) 


'  +  2e 


(p)) 


(6) 


where  (/|p^ )  =  J^j T-  =  ( J p  +  jp  )/TT.  For  concision,  we  leave  the  dependence  of p^  and  on  ht,  hp  Jp\ 

and  Ty  implicit.  Similarly  to  the  case  for  hp  this  yields 


<*ij  (4p))  =  \ ln 


1  -  Pjjp  t  V  I  I) 

P9(  T  T  V  II) 


(7) 


For  a  given  value  of  /(p),  the  experiment  was  run  in  six  batches.  In  each  batch,  the  programmed  coupling  j(p 
was  uniformly  set  to  /p)  for  each  coupler  of  a  pairwise  disjoint  subset  of  all  of  the  couplers,  and  for  each  of  the  rest 
jp  was  set  to  zero.  Over  the  six  batches,  each  coupling  was  set  to  /(p)  exactly  once.  As  for  the  h  biases,  each 
value  of/p)  was  run  100  times  (for  each  coupler),  with  each  run  consisting  of  1,000  annealing  cycles.  The  median 
probability  p..  (| T V|||/|p^ )  was  taken  over  the  100  {pp  }  calculated  from  the  runs,  from  which  we  calculate  the 
“median”  pp )  =  a  Qf(TTVj  j|/|p))).For  each  coupler,  a  line  was  fit  to  {pp ,  atj )},  yielding  a  slope  =  1/Ttj 

and  an  intercept  o$\  As  for  h ,  we  define  the  mean  temperature  =  —  pip  TT ,  where  m  is  the  number  of 
couplers. 

A  more  accurate  estimate  for  can  be  obtained  by  considering  nonzero  h P  and  (but  still  setting 
fc/p)  =  hjp)  =  0).  Let 

Zsg  e(5^P+5^b)+^5;(^)+^)))/Ty 

si}s;-e{±i}  (8) 

be  the  partition  function.  Then 


Py(TT)Pg(U) 

Py(TI)p!y(IT)  (9) 


e-  (hf’hhfhh,)  /  Tljg- (-h^-hfhUj) /T,J  !Z1 
-  /  Tljg- (-kW+hW-J^/T,]  jZl 


(10) 


or 


(ii) 


«,;(4P)’  jf\  T{j)  =  i  ln 


P_jji  TDpyOT) 

P  ij  (tt  )Py  (ll)  * 


(12) 


Note  that  the  assumption  that  hP  =  =  0  implies  thatp^TT)  =Pij(ll)  andpy( T l)=py( lt)»  an(i  that  in  this 

case  (12)  reduces  to  (7). 


Benchmark  studies.  To  generate  a  single  instance  with  range  r,  the  local  fields  {/z2j  were  uniformly  set  to 
zero,  and  each  available  Jy  was  independently  and  uniformly  selected  from  {— r,  —  r  +  1,  ...,—1,0, 1, . . .,  r}.  The 
resulting  instance  was  then  scaled  by  the  overall  factor  0.9/r  so  that  the  largest  magnitude  |/y  |  was  0.9.  (This  was 
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necessary,  rather  than  scaling  to  1  as  in  previous  studies,  to  allow  for  consistency  with  future  experiments  in  which 
the  /biases  are  corrected.)  100  such  instances  were  generated  and  run  twice  with  1,000  annealing  cycles  for  each 
of  the  same  (uniformly  randomly  generated)  10  gauges.  In  all  runs,  {/y  }  were  programmed  as  in  the  instances.  For 
the  first  set  of  runs,  which  we  call  “uncorrected”  the  {hf  were  also  programmed  as  in  the  instances,  i.e.  to  zero.  For 
the  other  set,  which  we  call  “fi-corrected”,  the  local  fields  were  programmed  to  the  inverse  of  the  biases  computed 
via  experiments  as  in  Eq.  4. 

Greedy  and  elite  mean  metrics.  The  greedy  comparison  is  as  follows:  the  energies  of  all  states  returned  were  com¬ 
puted,  and  those  for  all  gauges  were  grouped  together.  Whichever  method  (uncorrected  or  corrected)  returned 
the  lower  minimum  energy  was  deemed  to  have  performed  better.  If  the  minimum  energies  were  the  same,  the 
tie  was  broken  by  the  number  of  times  that  energy  was  returned.  If  this  number  was  the  same,  the  method  with 
the  second-lowest  energy  was  deemed  to  have  performed  better,  with  ties  broken  by  the  number  of  times  the 
second-lowest  energy  was  returned,  and  so  on.  The  “elite  mean”  score  function23,  a  quantity  previously  introduced 
to  allow  comparison  of  the  performance  of  different  programming  parameters  in  quantum  annealers  when  the 
success  probabilities  are  too  low  (and  thus  noisy),  is  defined  as  the  mean  energy  of  the  “elite”  states,  i.e.  those  with 
the  lowest  energies.  The  elite  mean  is  parameterized  by  the  fraction  of  energies  over  which  to  take  the  mean;  here 
we  use  2%.  For  r=  1  and  r  =  2,  there  were  6  and  2  instances,  respectively,  for  which  the  elite  mean  comparison  was 
tied,  all  but  one  due  to  success  probabilities  greater  than  2%  for  both  the  corrected  and  uncorrected  experiments. 
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